Abstract. Iteratee IO is a style of incremental input processing with
precise resource control. The style encourages building input processors 
from a user-extensible set of primitives by chaining, layering, pairing and 
other modes of compositions. The programmer is still able, where needed, 
to precisely control look-ahead, the allocation of buffers, file descriptors 
and other resources. The style is especially suitable for processing
communication streams, large amounts of data, and data that have undergone
several levels of encoding such as pickling, compression, chunking, framing. It
has been used for programming high-performance (HTTP) servers and 
web frameworks, in computational linguistics and financial trading. 
We exposit programming with iteratees, contrasting them with Lazy IO 
and the Handle-based, stdio-like IO. We relate them to online parser com- 
binators. We introduce a simple implementation as free monads, which 
lets us formally reason with iteratees. As an example, we validate several 
equational laws and use them to optimize iteratee programs. The simple 
implementation helps understand existing implementations of iteratees 
and derive new ones. 

"We should have some ways of coupling programs like garden 
hose - screw in another segment when it becomes necessary 
to massage data in another way. This is the way of IO also." 
M. D. McIlroy. October 11, 1964.

1 Introduction 

Iteratee IO is a style of compositional incremental input processing with precise 
resource control. As such it is conducive to handling large amounts of data and 
programming of long-running servers. Iteratee IO has been proven in practice: it
is employed in several commercially deployed web frameworks (e.g., [2]) and has
been used in financial trading applications [9] and natural language processing.
The good performance of Iteratee IO is seen in several benchmarks, web-related
(including SNAP) and others [6, 10]. Its performance, compositionality and high
level of abstraction have attracted attention. As of May 2011, there are three
main implementations of Iteratee IO on Hackage: 1 iteratee-0.8.3,
enumerator-0.4.10 and the extensive iterIO, as well as several variations.
Iteratee IO lends itself to efficient, online parser combinator libraries similar
to [1, 13]. First introduced to Haskell [5], Iteratee IO has since been ported to
F#, 2 Scala and other languages.

1 http://hackage.haskell.org 

2 https://github.com/fsharp/fsharpx
The goal of Iteratee IO is to overcome the drawbacks of Lazy and Handle-based
IO and combine their strong features. Lazy IO, an instance of memory-mapped
IO, is an elegant abstraction that effectively eliminates IO, giving programmers
the impression that the entire file is available in memory and may be accessed
as an ordinary string. There is no longer any need to read explicitly, let alone
worry about buffer allocations and underflows. The abstraction of a file as an
in-memory string comes without guilt: behind the scenes, the operating or
run-time system accesses the file efficiently, reading it on demand and sharing
the read data. Lazy IO facilitates compositional input processors like parser
combinators.

Lazy IO is so irresistible that it was added to Haskell despite the reservations
of its inventors and the failure to develop good techniques for reasoning about
its correctness [7, Sec 10.5]. However benign, reading is an observable side effect,
whose occurrence may have to be correlated with other side effects. Such
correlations are crucial when performing IO over communication pipes, which is
typical of web servers. 3 As Launchbury and Peyton Jones feared, Lazy IO indeed
"gives rise to a very subtle class of programming errors". We have seen
deadlocks; mishandling of IO errors; running out of file descriptors and similar
scarce resources; unpredictable, volatile and sometimes unbearably excessive use
of memory. We illustrate the splendors and miseries of Lazy IO in §2.

Handle-based IO is the stdio-style IO familiar from C. It is 'strict': IO
operations must be explicitly requested. Therefore, it affords precise control of
resources and the detection of all IO errors. However, it is very low-level: every
read operation is painfully explicit. Handle-based IO hides the buffering,
providing the abstraction of a stream of characters. The abstraction does not
extend to streams of other data types, and does not support stream embeddings. The
programmer must be constantly aware of the current file position, which makes
it tortuous to process layered streams or combine parsers to process the same
stream in parallel. §2 illustrates these problems as well.

One wishes for a set of abstractions that free programmers from thinking
about IO, and yet provide facilities to control buffering, look-ahead, locking, etc.
at those moments where it matters. One wishes to derive these abstractions and
optimize them by algebraic transformations based on equational laws.

Iteratee IO is an approach to this ideal, amalgamating ideas going back to
the IO of Haskell 1.3; Kernel-Prolog's iterator objects [15] uniformly representing
files, in-memory collections and processes; the resumption monad, surveyed in
[3]; and generators of Alphard [12] (which now live on as Java streams and
generators in modern languages). Like Handle IO, Iteratee IO gives error handling,
precise control over important operations and resource allocations, incremental
processing and high performance. Like Lazy IO, it gives high-level abstractions,
encapsulating input processing layers that can be nested and composed
sequentially or in parallel. Iteratee IO turns out to offer reasoning principles
letting us derive implementations and optimize them.

Although all implementations of Iteratee IO follow the same principles, there
are many variations based on historical accidents, handling of buffering, levels



3 POSIX memory-mapped IO, mmap, does not work with communication pipes either.
of generality. There are even more tutorials, varying in generality and points of
view [6, 8, 16-18]. The desire for a standpoint to grasp the idea of iteratees, to 
assess and derive their variations and to reason with them is the motivation for 
the present paper. 

The contributions and the structure of the paper. We argue that the
essence of Iteratee IO is captured by two inter-related views: stream processing
networks and incremental online parser combinators. Based on the final co-algebra
model of stream processors, we formulate, for the first time, algebraic laws that
let us derive and simplify iteratee parser combinators.

§2 introduces Iteratee IO, using an Iteratee library to write a progression
of examples abstracted from web server programming and computational
linguistics. We use the examples to contrast Iteratee IO with Lazy and Handle-based
IO and to give them informal semantics.

§3 defines the semantics formally, as an interpretation of the data type denot- 
ing iteratees. The semantics lets us view iteratees as parsers. The rich algebraic 
structure of the iteratee data type - final co-algebra and free monad - gives 
rise to algebraic laws, which let us build and reason about iteratee programs 
compositionally. Appendix B of the full paper details optimizing iteratee parser 
combinators using the equational laws. 

Appendix A proves the equational laws in a more general setting of effectful 
iteratees - in which input processing is accompanied by effects in some monad. 
Buffering and look-ahead are two particular examples of such an effect. This 
insight clarifies the implementation of buffering in iteratee libraries - which so 
far has been the most confusing feature. 

More material about iteratees, including demonstrations, tutorials and
references to iteratee libraries, is available online at
http://okmij.org/ftp/Streams.html. The annotated source code for all the examples
in the paper can be found at http://okmij.org/ftp/Haskell/Iteratee/, which is the
base URL for all code files referenced in this paper.

2 Programming with Iteratees 

This section introduces programming with iteratees on a series of progressively 
more complex examples. We stress compositionality - assembling input proces- 
sors from previously written or library components. We appeal to the intuitions 
of more familiar Lazy IO and Handle IO when explaining iteratees. Therefore,
the examples are also written in Lazy IO, and, when feasible, Handle IO. The
contrast lets us see the advantages of Lazy and Handle IO that Iteratee IO
inherits, and the drawbacks it is designed to overcome. (In particular, we shall see
Lazy IO's unexpected, huge memory consumption and its wasting of scarce resources
like file descriptors.)

The examples revolve around reading potentially very large text and count- 
ing specific words and whitespace. The final example, abstracted from interactive 

4 http://okmij.org/ftp/Haskell/Iteratee/describe.pdf
systems, tests orchestration: reading a communication pipe up to a terminator
but not a byte further. The complete code for all examples - with tests and
the extra details left out of the paper - is available online (IterDemo1.hs). The
code uses the IterateeM library available from the same site. Figure 1 lists
the interface of the library fragment used in this section, pointing out related
functions from the Haskell standard library. A different series of illustrative
examples, counting lines and words and searching for the first or all occurrences
of a word - implementing wc and grep - is given in IterDemo.hs. The latter
set of examples illustrates error handling and the encapsulation of state.

type Iteratee el m a        -- a processor of the stream of els
                            -- in a monad m yielding the result of type a

instance Monad m => Monad (Iteratee el m)
instance MonadTrans (Iteratee el)

getchar :: Monad m => Iteratee el m (Maybe el)    -- cf. IO.getChar, List.head
count_i :: Monad m => Iteratee el m Int           -- cf. List.length
run     :: Monad m => Iteratee el m a -> m a      -- extract Iteratee's result

-- A producer of the stream of els in a monad m
type Enumerator el m a = Iteratee el m a -> m (Iteratee el m a)
enum_file :: FilePath -> Enumerator Char IO a     -- Enumerator of a file

-- A transformer of the stream of elo to the stream of eli
-- (a producer of the stream eli and a consumer of the stream elo)
type Enumeratee elo eli m a =
  Iteratee eli m a -> Iteratee elo m (Iteratee eli m a)

en_filter  :: Monad m => (el -> Bool) -> Enumeratee el el m a
take       :: Monad m => Int -> Enumeratee el el m a      -- cf. List.take
enum_words :: Monad m => Enumeratee Char String m a       -- cf. List.words

-- Kleisli (monadic function) composition: composing enumerators
(>>>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)

-- Connecting producers with transformers (cf. (=<<))
infixr 1 .|    -- right-associative
(.|) :: Monad m =>
  (Iteratee el m a -> w) -> Iteratee el m (Iteratee el' m a) -> w

-- Parallel composition of iteratees (cf. List.zip)
en_pair :: Monad m =>
  Iteratee el m a -> Iteratee el m b -> Iteratee el m (a,b)

Fig. 1. The interface of the IterateeM library fragment

We start lightly, with counting whitespace characters. The Lazy IO code
pattern-matches on the ordinary string (cf. more elegant code later):

countWS_lazy :: String -> Int
countWS_lazy "" = 0
countWS_lazy (c:str) | isSpace c = 1 + countWS_lazy str
countWS_lazy (_:str) = countWS_lazy str

There are no appearances of any IO operations, which is the main appeal of
Lazy IO. The IO is banished to the highest-level code, which opens a file and
"reads it all" into a string, to hand over to countWS_lazy.

run_countWSL fname = readFile fname >>= print . countWS_lazy

The run-time system reads the file on demand, so that the counting runs in
small, constant memory.

The Handle IO code is in the style of stdio, familiar to C programmers. It,
too, "pattern-matches" on the input stream.

countWS_handle :: Handle -> IO Int
countWS_handle h = loop 0
  where
  loop n = try (hGetChar h) >>= check n
  check n (Right c) = loop (if isSpace c then n+1 else n)
  check n (Left e) | Just ioe <- fromException e,
                     isEOFError ioe = return n
  check _ (Left e) = throw e

run_countWSH fname =
  bracket (openFile fname ReadMode) hClose $ \h ->
    countWS_handle h >>= print

It, too, runs in small and constant memory. Error handling stands out. We now
differentiate EOF (end-of-file) from other IO errors, which is impossible with
Lazy IO. Also unlike Lazy IO, the code is explicit about closing the file, ensuring
that the file is closed (IO errors or not) before run_countWSH returns.

The Iteratee IO code below should look quite similar to the earlier examples.
The intuition of pattern-matching on the stream still applies; the stream is
implicit, however. Since there is no explicit 'handle', errors like reading from an
already closed handle become impossible.

countWS_iter :: Monad m => Iteratee Char m Int
countWS_iter = loop 0          -- tail-recursive
  where loop n = getchar >>= check n
        check n Nothing = return n
        check n (Just c) = loop (if isSpace c then n+1 else n)

We have written an iteratee: it reads the stream of characters and produces an
Int. Polymorphism over the monad m tells us that the iteratee (like countWS_lazy)
is pure. The library iteratee getchar 5 (see Figure 1) is quite like try (hGetChar h)

5 Although some Iteratee libraries indeed provide something like getchar, we implement
it ourselves, in IterDemo1.hs. §3 explains the idea of the implementation and gives
a simpler version, called oneL in Figure 2.

in the Handle IO code: getchar reads a character from the stream and returns it;
the result Nothing signifies EOF. The Iteratee library takes care of detecting and
propagating other IO errors. Iteratee el m is a monad, letting us write iteratee
code in the standard monadic style (again compare with the Handle IO code
above).

This is how we count whitespace in a file:

run_countWSI fname = print =<< run =<< enum_file fname countWS_iter

Here, enum_file is an enumerator: it enumerates a file, passing file data to an
iteratee. The precise enumerator/iteratee interaction is described in §3. A user
of an Iteratee library should find the stdin intuition sufficient: the iteratee getchar
is quite like Prelude.getChar, which reads a character from the buffer containing
the current chunk of the standard input. If the buffer is empty, the OS is requested
to fill it. Our enum_file opens the file on the 'standard stream' and plays the
OS for the iteratee, reading a chunk when the iteratee asks to fill its buffer.
When the file is exhausted, or when the iteratee stops asking for more data,
the iteratee, encapsulating the resulting state, is returned. The resulting iteratee
cannot hold any references to the file: in fact, an iteratee cannot know if its 'stdin'
data come from a file or another source. Therefore, enum_file's closing the file upon
return is safe. 6 The function run tells the iteratee that the stream is finished
and extracts its result, the integer counter in our case. (App. C gives another,
plumbing intuition for Iteratee IO.) As with Lazy IO, the file is read incrementally
and on demand. As with Handle IO, the file is closed when enum_file returns, IO
errors or not. Explicit bracketing is not needed. Like Handle IO (and unlike Lazy
IO), iteratees support precise error handling and accounting of scarce resources
like file descriptors. Unlike Handle IO, the boring details are hidden away.

Lazy IO permits a far more elegant solution: a one-liner, using the standard
Prelude functions on lists:

countWS'_lazy :: String -> Int
countWS'_lazy = length . filter isSpace

We filter out the characters other than whitespace, and count the remainder.
As expected for pure Haskell, filtering and counting are done incrementally and
lazily. The intermediate list is never fully constructed. Iteratee IO matches the
algorithm and the elegance:

countWS'_iter :: Monad m => Iteratee Char m Int
countWS'_iter = id .| (en_filter isSpace) count_i

We use three new library primitives, Figure 1: the iteratee count_i, like length,
returns the length of the stream. The enumeratee en_filter is a stream transformer.
The type of Enumeratee elo eli m a almost fits the pattern of Enumerator eli m a:
indeed, the enumeratee is an enumerator for the inner stream (of eli-type
elements), taking data from the outer stream. That is, Enumeratee elo eli m a
converts a stream of elos into a stream of elis. The conversion is not
necessarily in lock-step, as is the case for en_filter: although the outer and the

6 In that respect, enum_file is similar to Scheme's with-input-from-file.

inner streams have elements of the same type, the inner stream has generally
fewer elements. Although conceptually a 'stream' is an (infinitely) long
sequence of elements, at any given time only a single, small chunk of the stream
is present in memory. Our en_filter requests a chunk from the outer stream
and creates a filtered chunk, to pass to an iteratee. All stream conversion is
strictly incremental. There is not even a chance of producing a large
intermediate data structure, and no need to trust laziness or GHC fusion rules. The
combinator (.|), akin to run, 'runs' the enumeratee, that is, terminates the
inner stream and extracts the result of the inner iteratee, passing it to the
consumer, the left argument of (.|). (There are cases, not described in the paper,
where the inner stream should be left unterminated so it can be passed to
another enumeratee: e.g., processing of HTTP chunk-encoded streams.) To count
whitespace in a file, we write enum_file fname countWS'_iter. The equational law
f (g .| h) = (f . g) .| h gives enum_file fname .| en_filter isSpace count_i,
which resembles a Unix pipeline.
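
The filtering pipeline can be tried out on a much smaller model than the IterateeM library. The sketch below is only an illustration under stated assumptions: it uses the pure, character-at-a-time iteratee type that §3 develops (no chunks, no monad m), and the names count, enFilter and pipe are hypothetical stand-ins for count_i, en_filter and (.|), not the library's definitions.

```haskell
import Data.Char (isSpace)

type LChar = Maybe Char                 -- Nothing stands for EOF

data I a = Done a                       -- finished, with a result
         | GetC (LChar -> I a)          -- waiting for one lifted character

-- Feed a finite string to an iteratee (the enumerator of §3).
en_str :: String -> I a -> I a
en_str ""    i        = i
en_str (c:t) (GetC k) = en_str t (k (Just c))
en_str _     (Done x) = Done x

-- Signal EOF and extract the result.
run :: I a -> a
run (GetC k) = run (k Nothing)
run (Done x) = x

-- Stand-in for count_i: the number of elements received before EOF.
count :: I Int
count = loop 0
  where loop n = GetC (check n)
        check n (Just _) = loop (n + 1)
        check n Nothing  = Done n

-- Stand-in for en_filter: forwards only the characters satisfying p.
enFilter :: (Char -> Bool) -> I a -> I (I a)
enFilter _ i@(Done _) = Done i          -- inner iteratee finished: stop
enFilter p i@(GetC k) = GetC step
  where step (Just c) | p c       = enFilter p (k (Just c))
                      | otherwise = enFilter p i
        step Nothing              = Done i   -- outer EOF: hand back the inner iteratee

-- Stand-in for (.|): terminate the inner stream, pass the result to f.
pipe :: (I a -> w) -> I (I a) -> w
pipe f = f . flatten
  where flatten (Done inner) = Done (run inner)
        flatten (GetC k)     = GetC (flatten . k)

-- The two forms related by the law f (g .| h) = (f . g) .| h.
wsCount, wsCount' :: String -> Int
wsCount  s = run (en_str s (id `pipe` enFilter isSpace count))
wsCount' s = run (en_str s `pipe` enFilter isSpace count)
```

Both wsCount and wsCount' count the whitespace characters of their argument; at no point does the filter hold more than the one character it is currently inspecting.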

We modify the example to count the occurrences of the word "the" (assuming
the input is text with words of bounded size). The Lazy IO code is most
straightforward, relying on Prelude.words to parse the input string into a list
of words. We filter out words other than "the" and count the remainder. Thanks to
Haskell's laziness, the whole operation runs in constant space.

countTHE_lazy :: String -> Int
countTHE_lazy = length . filter (== "the") . words

The Handle IO code for this example is complex. Not only should we search
for "the", we also have to make sure that the characters before and after it (if
they exist) are whitespace. On top of that, we have to deal with errors and EOF. The
simplest solution is to write the finite state machine recognizer explicitly; see
IterDemo1.hs for details. The result is too big to put in the paper, reminding us
that Handle IO is really low-level. An abstraction is direly needed, for example,
in the form of a lexer generator - or Iteratee IO.

Here is the Iteratee IO code (recall that (.|) is right-associative):

countTHE_iter :: Monad m => Iteratee Char m Int
countTHE_iter = id .| enum_words .| en_filter (== "the") count_i

It is quite like the Lazy IO code, converting a character stream to a word stream
to the filtered stream, which is then counted.

Let us extend the example so as to count the word "the" within a sequence of
files, as if they were concatenated. We shall count "the" even if it is split between
two files. The Lazy IO code re-uses the previously written counting function
countTHE_lazy, which now receives a string that is the concatenation of all files'
contents.

run_manyTHEL fnames =
  mapM readFile fnames >>= print . countTHE_lazy . concat

The code is elegant; the processing is incremental, reading only one file at a
time. Alas, we have to open all files first! The action readFile, which opens a file
and prepares it for lazy reading, is performed in sequence on all files - prior to
counting. Since counting is pure, we cannot execute IO actions like readFile in
the counting code. Therefore, we need as many file descriptors as there are files
in the fnames list. If the list is long, we may run out of file descriptors. Ideally,
however, we only need one file descriptor, opening and closing it as we go. We
get the first intimation of Lazy IO resource mismanagement - with no interface
to correct it.

The Handle IO code to count in multiple files is even more complex than the
single-file counter. We do not show the code for lack of space.

The Iteratee IO code again looks quite like the Lazy IO code, re-using the
previously written iteratee countTHE_iter. Kleisli (monadic function) composition
(>>>) builds an enumerator from two others, effectively sending to an iteratee
first the chunks of the first stream and then the chunks of the second. In short,
composing enumerators concatenates their sources. We elaborate on that
property and state it formally in §3.

run_manyTHEI fnames = print =<< run =<<
  foldr1 (>>>) (map enum_file fnames) countTHE_iter

(One can use the regular foldr, keeping in mind that the unit of (>>>) is return.)
Unlike Lazy IO, only a single file descriptor is used during the whole counting;
only one file is open at any given time.

As a test of compositionality, we combine the two counting operations and
count "the" and whitespace together. The Lazy IO code is elegantly straightforward.

run_countPairL fname = do
  str <- readFile fname
  print (countWS_lazy str, countTHE_lazy str)

Here we run into one of the pitfalls of Lazy IO: the counting is no longer
incremental. The whole file is loaded in memory. For applications processing large
files or long streams, Lazy IO is too unreliable for use in production.

The Iteratee code also re-uses previously written counters. It pairs them,
relying on the parallel composition of iteratees, en_pair.

run_countPairI fname =
  print =<< run =<< enum_file fname (countWS_iter `en_pair` countTHE_iter)

One may think of en_pair as 'splitting' (or duplicating) the stream. In reality
en_pair does no copying or buffering: it receives a chunk and passes it to its two
argument iteratees. If both iteratees want more data, a new chunk is requested.
Unlike Lazy IO, the processing remains incremental and in constant memory:
as we read a block from the file, we send the block to the two iteratees.
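
To make the "no copying or buffering" point concrete, here is a minimal sketch of parallel composition on the pure, character-at-a-time iteratee type of §3. It is an assumption-laden illustration rather than the IterateeM definition: enPair, feed and countIf are hypothetical names, and the second consumer counts newlines instead of the word "the" to keep the example short.

```haskell
import Data.Char (isSpace)

type LChar = Maybe Char                 -- Nothing stands for EOF

data I a = Done a
         | GetC (LChar -> I a)

en_str :: String -> I a -> I a
en_str ""    i        = i
en_str (c:t) (GetC k) = en_str t (k (Just c))
en_str _     (Done x) = Done x

run :: I a -> a
run (GetC k) = run (k Nothing)
run (Done x) = x

-- Pass one lifted character to an iteratee; a finished iteratee ignores it.
feed :: LChar -> I a -> I a
feed c (GetC k) = k c
feed _ i        = i

-- Parallel composition: each character is handed to both iteratees in turn.
-- Input is requested only while at least one of them still wants data.
enPair :: I a -> I b -> I (a, b)
enPair (Done x) (Done y) = Done (x, y)
enPair i1 i2             = GetC (\c -> enPair (feed c i1) (feed c i2))

-- Count the characters satisfying a predicate.
countIf :: (Char -> Bool) -> I Int
countIf p = loop 0
  where loop n = GetC (check n)
        check n Nothing  = Done n
        check n (Just c) = loop (if p c then n + 1 else n)

-- Whitespace count and newline count, computed in a single pass.
pairCounts :: String -> (Int, Int)
pairCounts s = run (en_str s (countIf isSpace `enPair` countIf (== '\n')))
```

The input string is traversed once; neither counter ever sees more than the current character, so nothing forces the whole input to be retained.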

Our final example demonstrates early termination, prior to EOF. We modify
the previous "the" and whitespace counter to count only within the prefix
of the stream of size at most N. This example is abstracted from reading
HTTP request content with an explicitly specified Content-Length. We should
not attempt to read even a single byte after N, since a web client expects the
reply first, before it sends the next request. If we attempt to read ahead after
N bytes, deadlock ensues. The Lazy IO code uses Prelude.take to lazily
obtain the prefix of the file content, which is processed as before.

run_ntermL n fname = do
  str0 <- readFile fname
  let str = Prelude.take n str0
  print (countWS_lazy str, countTHE_lazy str)

As in the previous example, this counting does not run in constant memory.
There are bigger problems. First, since we generally stop reading before EOF,
the run-time system will not close the file descriptor. It will be closed when
the corresponding finalizer is run, which may happen very late. Leaking
file descriptors puts us in danger of running out of them, which indeed happens in
practice when using Lazy IO with programs that process lots of files. Most serious
is the real danger of a deadlock. The run-time system may speculatively read
ahead, at any time and for any reason. The programmer has no way whatsoever
to control this read-ahead or even be informed about it. Deadlock does routinely
happen in practice when using Lazy IO for interactive services. 7

Lazy IO was designed to give the impression that IO is not even happening.
When dealing with communication pipes and request-response servers, even
reading is an observable effect. The precise control of reading actions is crucial.
Lazy IO becomes the wrong abstraction.

The Iteratee IO code, like the earlier Lazy IO code, differs from the previous
run_countPairI in one change: the addition of take, this time from IterateeM
rather than Prelude.

run_ntermI n fname =
  print =<< run =<< enum_file fname .|
    IterateeM.take n (countWS_iter `en_pair` countTHE_iter)

Like en_filter, the enumeratee take substreams its outer stream, namely, takes the
prefix of size at most n. As soon as take n gets its n elements, it stops asking
for more data, prompting its enumerator, enum_file, to close the file. The Iteratee
code is just as concise as the Lazy IO code; both are quite alike. Since IO is now
done strictly, the iteratee code gives full control over file opening, closing, and
reading. (IterDemo1.hs has another early termination example, reading the file
up to the first occurrence of a given string.)
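
The "not a byte further" property can be observed directly in a small model. The following sketch again assumes the pure, character-at-a-time iteratee type of §3 instead of IterateeM; takeN, feedStr and countAll are hypothetical names. Unlike en_str, feedStr also returns the input it did not consume, which is what lets us see where reading stopped. One simplification to keep in mind: this takeN stops as soon as the inner iteratee finishes, whereas a library version may instead skip the rest of the prefix.

```haskell
type LChar = Maybe Char                 -- Nothing stands for EOF

data I a = Done a
         | GetC (LChar -> I a)

run :: I a -> a
run (GetC k) = run (k Nothing)
run (Done x) = x

-- Feed characters while the iteratee asks for them;
-- return the resulting iteratee and the unconsumed input.
feedStr :: String -> I a -> (I a, String)
feedStr (c:t) (GetC k) = feedStr t (k (Just c))
feedStr rest  i        = (i, rest)

-- Prefix enumeratee: passes at most n characters to the inner iteratee.
takeN :: Int -> I a -> I (I a)
takeN n i | n <= 0 = Done i             -- quota used up: stop asking for data
takeN _ i@(Done _) = Done i             -- inner iteratee finished early
takeN n i@(GetC k) = GetC step
  where step Nothing = Done i           -- outer stream ended before n characters
        step c       = takeN (n - 1) (k c)

-- Counts every character it is given.
countAll :: I Int
countAll = loop 0
  where loop n = GetC (check n)
        check n (Just _) = loop (n + 1)
        check n Nothing  = Done n

-- Count within the first n characters and report what was left unread.
countPrefix :: Int -> String -> (Int, String)
countPrefix n s = (run (run outer), rest)
  where (outer, rest) = feedStr s (takeN n countAll)
```

For instance, countPrefix 3 "abcdef" yields the count 3 together with the untouched remainder "def": once takeN has its three characters it becomes Done, so the feeding loop has nothing left to ask for.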

We have seen that both Lazy IO and Iteratee IO allow assembling the
whole program from independent building blocks. Both IO styles permit
incremental processing, reading file data on demand. Because Iteratee IO is not
lazy, the Iteratee library can ensure timely deallocation of resources, precise IO
error handling, and precise control of reading actions. Iteratee IO, unlike Lazy IO,
guarantees incremental processing.

3 Enumerators and the semantics of iteratees 

This section outlines the conceptual design of iteratees, viewing iteratees and
enumerators as communicating sequential processes. Iteratee processes are modeled
as a data type; enumerators become interpreters, thus defining the semantics
of iteratees as parsers of an enumerator's source. We show the compositional
construction of iteratee-parsers and elucidate the algebraic laws that help design
iteratee parser combinators and simplify iteratee programs. The accompanying
code with derivations and several examples is available at IterDeriv.hs.

7 http://www.haskell.org/pipermail/haskell-cafe/2008-August/046532.html

Our running example is reading lines from the standard input until the empty
line, returning them in a list. The example is part of the common task of reading
HTTP or e-mail headers. A line is a maximal sequence of non-newline characters.
First, we write the example in the familiar C style, with getChar - or its
non-exceptional version getchar0, which, like the one in the C standard library,
returns the current character or EOF.

type LChar = Maybe Char      -- lifted character
getchar0 :: IO LChar

The function to read one line is later used to read all lines up to the empty line:

getline0 :: IO String
getline0 = loop ""
  where loop acc = getchar0 >>= check acc
        check acc (Just c) | c /= '\n' = loop (c:acc)
        check acc _ = return (reverse acc)

getlines0 :: IO [String]
getlines0 = loop []
  where loop acc = getline0 >>= check acc
        check acc "" = return (reverse acc)
        check acc l = loop (l:acc)

We may view getline0 and getlines0 as processes receiving lifted characters on
a dedicated channel stdin and terminating with a value (a line or a list of lines).
The simplest model represents such processes as a data type with a variant for
each process operation - finished or inputting a character (see [4] for a good
explanation of such modeling). We will call these processes iteratees. 8

data I a = Done a
         | GetC (LChar -> I a)

Here is the data type model of the line reader

getline :: I String
getline = loop ""
  where loop acc = GetC (check acc)
        check acc (Just c) | c /= '\n' = loop (c:acc)
        check acc _ = Done (reverse acc)

which looks almost identical to getline0. However, Done and GetC merely
represent process operations. Terms like getline hence do not "do" anything; they

8 For the origin of the name, see
http://okmij.org/ftp/Scheme/enumerators-callcc.html.
have to be interpreted so as to run the corresponding process. Our interpreter uses
a given finite string as the source of characters to send to the process.

eval :: String -> I a -> a
eval "" (GetC k) = eval "" $ k Nothing
eval (c:t) (GetC k) = eval t $ k (Just c)
eval str (Done x) = x

When the string is exhausted, the process is sent EOF (that is, Nothing).
Hopefully the process then finishes and we can return the produced result. One may
view eval str i as a Unix pipeline cat str | i.
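
As a quick check of the interpreter, the following self-contained sketch restates the definitions above and runs the line reader on two finite strings. The test inputs and the names firstLine and untilEOF are assumptions chosen for illustration; they do not come from the accompanying code.

```haskell
type LChar = Maybe Char           -- lifted character: Nothing stands for EOF

data I a = Done a                 -- the process has finished with a result
         | GetC (LChar -> I a)    -- the process waits for one lifted character

-- The data type model of the line reader.
getline :: I String
getline = loop ""
  where loop acc = GetC (check acc)
        check acc (Just c) | c /= '\n' = loop (c:acc)
        check acc _                    = Done (reverse acc)

-- Interpret an iteratee against a finite string; send EOF when it runs out.
eval :: String -> I a -> a
eval ""    (GetC k) = eval "" $ k Nothing
eval (c:t) (GetC k) = eval t $ k (Just c)
eval _     (Done x) = x

-- Illustrative inputs.
firstLine, untilEOF :: String
firstLine = eval "ab\ncd" getline   -- the newline ends the line; "cd" is never read
untilEOF  = eval "abc"    getline   -- no newline: EOF ends the line
```

In the first case the process finishes on its own and the rest of the string is ignored; in the second it finishes only because eval sends Nothing once the string is exhausted.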

The data type I a has a rich structure. I a is a final co-algebra of the functor
T(X) = a + X^LChar - which helps us prove algebraic laws of iteratees below. The
data type represents finitely branching trees with finite and infinite branches. 9
The interpreter eval s traces the path s in the tree. Last but not least, I a is a
free monad (for a good explanation and references, see [14]):

instance Monad I where
  return = Done
  Done x >>= f = f x
  GetC k >>= f = GetC (k >>> f)

(see Figure 1 for Kleisli composition (>>>)). Therefore, we may build iteratee
processes by chaining simpler ones with the monadic (>>=) operation. For
example, we chain getline to build the process model of the reader of lines; the
result looks identical to getlines0:

getlines :: I [String]
getlines = loop []
  where loop acc = getline >>= check acc
        check acc "" = return (reverse acc)
        check acc l = loop (l:acc)
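
Readers who want to load these definitions into a current compiler should note that GHC 7.10 and later require Functor and Applicative instances alongside any Monad instance. The sketch below adds them, defined through the monad operations; these two instances, the inlined Kleisli composition and the sample header text are additions made for illustration and are not part of the listing above.

```haskell
import Control.Monad (ap, liftM)

type LChar = Maybe Char

data I a = Done a
         | GetC (LChar -> I a)

-- Required by modern GHC; both are derived from the Monad operations.
instance Functor I where
  fmap = liftM

instance Applicative I where
  pure  = Done
  (<*>) = ap

instance Monad I where
  Done x >>= f = f x
  GetC k >>= f = GetC (\c -> k c >>= f)   -- Kleisli composition of k and f

getline :: I String
getline = loop ""
  where loop acc = GetC (check acc)
        check acc (Just c) | c /= '\n' = loop (c:acc)
        check acc _                    = Done (reverse acc)

getlines :: I [String]
getlines = loop []
  where loop acc = getline >>= check acc
        check acc "" = return (reverse acc)
        check acc l  = loop (l:acc)

eval :: String -> I a -> a
eval ""    (GetC k) = eval "" $ k Nothing
eval (c:t) (GetC k) = eval t $ k (Just c)
eval _     (Done x) = x

-- Illustrative input: two header lines, an empty line, then a body.
headers :: [String]
headers = eval "Host: example.org\nAccept: */*\n\nbody" getlines
```

Here headers evaluates to the two header lines; the empty line stops the reader, and the body is never inspected.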

The next step is simple but momentous: we factor eval into two interpreters,
separating the sending of data from the sending of EOF. The first
factor feeds the characters, until there are no more data or the iteratee process
is finished. The resulting iteratee is returned. The second interpreter tells the
iteratee that there are no more data, and extracts its result.

en_str :: String -> I a -> I a
en_str "" i = i
en_str (c:t) (GetC k) = en_str t $ k (Just c)
en_str _ (Done x) = Done x

run :: I a -> a
run (GetC k) = run $ k Nothing
run (Done x) = x

9 Recall that Haskell data types are co-inductive, letting us construct infinite terms,
such as getline.

Clearly, eval str = run . en_str str. We call en_str an enumerator, and its argument
str a source. The function run is analogous to eof of the UUparsing library [13].
Enumerators and run of the IterateeM library, Figure 1, look more general than
en_str and run above since IterateeM lets iteratees and enumerators perform side
effects in a monad m, when obtaining and processing input data. See App. A for
such a generalization.

The point of factoring eval is to obtain interpreters that can be functionally
composed. Furthermore,

Equational Law 1 (Composition)

en_str (s1 ++ s2) = en_str s2 . en_str s1

In words: the composition of enumerators corresponds to the concatenation of
their sources. The law holds for more general effectful enumerators and iteratees,
see App. A for the formulation and proof. If we overlook the last clause of en_str's
definition, en_str is an instance of foldr (which inspired the names 'iteratee' and
'enumerator'). The law of composition therefore is hardly surprising.
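
The law of composition can be spot-checked on concrete inputs. The sketch below is a test harness, not a proof: since iteratees contain functions and cannot be compared for equality, both sides are observed through run, and the check is made for the line reader only, over every way of splitting a given string. The helper names lawLhs, lawRhs, splits and compositionHolds are hypothetical.

```haskell
type LChar = Maybe Char

data I a = Done a
         | GetC (LChar -> I a)

getline :: I String
getline = loop ""
  where loop acc = GetC (check acc)
        check acc (Just c) | c /= '\n' = loop (c:acc)
        check acc _                    = Done (reverse acc)

en_str :: String -> I a -> I a
en_str ""    i        = i
en_str (c:t) (GetC k) = en_str t $ k (Just c)
en_str _     (Done x) = Done x

run :: I a -> a
run (GetC k) = run $ k Nothing
run (Done x) = x

-- The two sides of Law 1, each followed by run so the results are comparable.
lawLhs, lawRhs :: String -> String -> I a -> a
lawLhs s1 s2 = run . en_str (s1 ++ s2)
lawRhs s1 s2 = run . en_str s2 . en_str s1

-- All ways to split a string into a prefix and a suffix.
splits :: String -> [(String, String)]
splits s = [splitAt n s | n <- [0 .. length s]]

-- Does the law hold for getline on every split of s?
compositionHolds :: String -> Bool
compositionHolds s =
  and [lawLhs s1 s2 getline == lawRhs s1 s2 getline | (s1, s2) <- splits s]
```

Feeding "ab\ncd" in one piece or in any two pieces gives the same line, which is exactly what makes it safe to deliver a stream to an iteratee chunk by chunk.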

Another law of en_str illustrates the compositionality of the iteratee semantics 
and lets us view iteratees as parsers and build parser combinator libraries. An 
iteratee is a value of the data type I a, which per se has "no semantics". The 
interpreter en_str gives a semantics to iteratees, as a function from finite strings 
to either Done v or GetC k. When the result of en_str s i is Done v, we say that 
the iteratee i has recognized the string s, parsing it to the value v. It follows from 
the law of composition that if i has recognized the string s, i recognizes s ++ s2 
for any s2. We say that i properly recognizes the string s if i recognizes s but not 
any proper prefix of s. 

Equational Law 2 (Chaining) If iteratee i properly recognizes s1, then 
en_str (s1 ++ s2) (i >>= f) = en_str s1 i >>= en_str s2 . f

The proof of the general version of this law is given in App. A. 
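
The side condition of the chaining law matters, which a small experiment makes visible. The sketch below assumes the §3 definitions of LChar, I and the monad instance for I (the Functor and Applicative instances are added only because current GHC requires them); twoLines and chains are hypothetical names. The iteratee getline properly recognizes "ab\n", so the law holds for that prefix; it recognizes "ab\nc" only improperly, and there the two sides differ because the extra character is lost.

```haskell
import Control.Monad (ap, liftM)

type LChar = Maybe Char              -- assumed from §3
data I a = Done a | GetC (LChar -> I a)

instance Functor I where fmap = liftM
instance Applicative I where
  pure  = Done
  (<*>) = ap
instance Monad I where
  Done x >>= f = f x
  GetC k >>= f = GetC (\c -> k c >>= f)

en_str :: String -> I a -> I a
en_str ""    i        = i
en_str (c:t) (GetC k) = en_str t $ k (Just c)
en_str _     (Done x) = Done x

run :: I a -> a
run (GetC k) = run $ k Nothing
run (Done x) = x

-- A line reader in the style of §3 (reconstruction, an assumption).
getline :: I String
getline = loop ""
  where
    loop acc = GetC (check acc)
    check acc (Just c) | c /= '\n' = loop (c:acc)
    check acc _                    = Done (reverse acc)

-- Hypothetical continuation: read one more line and pair it with the first.
twoLines :: String -> I (String, String)
twoLines l1 = fmap (\l2 -> (l1, l2)) getline

-- Both sides of Equational Law 2, observed through run.
chains :: Eq b => String -> String -> I a -> (a -> I b) -> Bool
chains s1 s2 i f =
  run (en_str (s1 ++ s2) (i >>= f)) == run (en_str s1 i >>= en_str s2 . f)

main :: IO ()
main = do
  -- getline properly recognizes "ab\n": the law applies.
  print (chains "ab\n" "cd\nrest" getline twoLines)
  -- getline recognizes "ab\nc" but not properly: the sides differ.
  print (chains "ab\nc" "d\n" getline twoLines)
```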

The law of chaining tells us how to build a recognizer for a string from the 
recognizers of the string's prefix and suffix, thus defining the meaning of the 
sequential iteratee composition (>>=). To represent choice we need a parallel 
composition: the left-biased alternation combinator. 

(<|) :: I a -> I a -> I a
Done x <| _ = Done x
_ <| Done x = Done x
GetC k1 <| GetC k2 = GetC (\c -> k1 c <| k2 c)

The parser i1 <| i2 recognizes whatever the first finishing parser recognizes; in 
the event of a tie, the result of i1 is preferred. Whereas the left and right unit 
of (>>=) is Done, the left and right unit of <| is failure, Figure 2, which keeps 
requesting input even after receiving EOF. It is a "diverging iteratee": run failure 
diverges. (In the IterateeM library, run reports an error if an iteratee asks for data 
after receiving EOF.) 

Equational Law 3 (Zero) The failure is the left zero of bind 
failure >>= f = failure

Equational Law 4 (Right distributivity) 

i >>= \x -> (k1 x <| k2 x) = (i >>= k1) <| (i >>= k2)

The law is similar to the law L10 in the parallel parser combinator library [1]. 
Since <| commits to whatever a parser recognizes first, the left distributivity 
does not hold: 

(i1 <| i2) >>= k ≠ (i1 >>= k) <| (i2 >>= k)
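
A concrete instance shows why left distributivity fails. The sketch below is an illustration under assumptions: LChar, I and the monad instance are taken to be those of §3, the alternation operator is written <|, and i1, i2, keepPairs and recognized are hypothetical names. Here i1 finishes after one character and i2 after two; the continuation accepts only two-character results. On the left-hand side the alternation commits to i1, whose result is then rejected; on the right-hand side the second branch survives.

```haskell
import Control.Monad (ap, liftM)

type LChar = Maybe Char              -- assumed from §3
data I a = Done a | GetC (LChar -> I a)

instance Functor I where fmap = liftM
instance Applicative I where
  pure  = Done
  (<*>) = ap
instance Monad I where
  Done x >>= f = f x
  GetC k >>= f = GetC (\c -> k c >>= f)

en_str :: String -> I a -> I a
en_str ""    i        = i
en_str (c:t) (GetC k) = en_str t $ k (Just c)
en_str _     (Done x) = Done x

failure :: I a
failure = GetC (const failure)

oneL :: I LChar
oneL = GetC Done

one :: I Char
one = oneL >>= maybe failure return

(<|) :: I a -> I a -> I a
Done x  <| _       = Done x
_       <| Done x  = Done x
GetC k1 <| GetC k2 = GetC (\c -> k1 c <| k2 c)

-- Hypothetical parsers of one-character and two-character strings.
i1, i2 :: I String
i1 = do { c <- one; return [c] }
i2 = do { c <- one; d <- one; return [c, d] }

-- Hypothetical continuation: accept only two-character results.
keepPairs :: String -> I String
keepPairs s | length s == 2 = return s
            | otherwise     = failure

lhs, rhs :: I String
lhs = (i1 <| i2) >>= keepPairs
rhs = (i1 >>= keepPairs) <| (i2 >>= keepPairs)

-- Just v if the iteratee has recognized the string, Nothing otherwise.
recognized :: String -> I a -> Maybe a
recognized s i = case en_str s i of
  Done v -> Just v
  GetC _ -> Nothing

main :: IO ()
main = do
  print (recognized "ab" lhs)   -- the committed branch was rejected
  print (recognized "ab" rhs)   -- the two-character branch finishes
```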

Primitive parsers 

failure :: I a                  -- The parser of nothing
failure = GetC (const failure)

empty :: a -> I a               -- The parser of the empty string
empty v = Done v

oneL :: I LChar                 -- The parser of one lifted character
oneL = GetC Done

Parser combinators: chaining and alternation 

(>>=) :: I a -> (a -> I b) -> I b
(<|)  :: I a -> I a -> I a

Derived parsers 

-- The parser of a one-character string
one :: I Char
one = oneL >>= maybe failure return

-- The parser of a character satisfying the given predicate
pSat :: (LChar -> Bool) -> I LChar
pSat pred = oneL >>= \c -> if pred c then return c else failure

Fig. 2. Parser combinator library for simple iteratees 

We thus arrive at the simple parser combinator library, Figure 2, which lets 
us derive the parsers getline and getlines that we previously built by intuition. 
First, we use the library to write the line reader in the 'obviously correct' way, 
expressing our definition of a line: 

pGetline :: I String
pGetline = nl <| liftM2 (:) one pGetline
  where nl = do
          pSat (\c -> c == Just '\n' || c == Nothing)
          return ""

It is quite inefficient: if the current character is not newline, nl turns into failure, 
which does nothing but keeps receiving characters and discarding them. Such 
wasteful operations should be eliminated. Noting that pSat and one start with 
oneL that can be factored out by the right distributivity, and applying the laws 
of failure gives us 

pGetline' :: I String
pGetline' = oneL >>= check
  where check (Just '\n') = return ""
        check Nothing     = return ""
        check (Just c)    = liftM (c:) pGetline'

(See App. B for the complete equational derivation.) This iteratee is a non- 
tail-recursive version of getline that we wrote ad hoc earlier with explicit process 
constructors GetC and Done. (The tail-recursive conversion through accumulator 
is standard.) The correctness of getlines follows from the law of chaining. 
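
The equivalence of the two line readers can be spot-checked before following the derivation. This is a minimal sketch assuming the §3 definitions of LChar, I and its monad instance, and writing the alternation operator as <|; agree is a hypothetical helper. It compares the results of both readers on a few strings, with and without a terminating newline.

```haskell
import Control.Monad (ap, liftM, liftM2)

type LChar = Maybe Char              -- assumed from §3
data I a = Done a | GetC (LChar -> I a)

instance Functor I where fmap = liftM
instance Applicative I where
  pure  = Done
  (<*>) = ap
instance Monad I where
  Done x >>= f = f x
  GetC k >>= f = GetC (\c -> k c >>= f)

en_str :: String -> I a -> I a
en_str ""    i        = i
en_str (c:t) (GetC k) = en_str t $ k (Just c)
en_str _     (Done x) = Done x

run :: I a -> a
run (GetC k) = run $ k Nothing
run (Done x) = x

-- The library of Figure 2.
failure :: I a
failure = GetC (const failure)

oneL :: I LChar
oneL = GetC Done

one :: I Char
one = oneL >>= maybe failure return

pSat :: (LChar -> Bool) -> I LChar
pSat p = oneL >>= \c -> if p c then return c else failure

(<|) :: I a -> I a -> I a
Done x  <| _       = Done x
_       <| Done x  = Done x
GetC k1 <| GetC k2 = GetC (\c -> k1 c <| k2 c)

-- The 'obviously correct' reader: each non-newline character leaves
-- behind a failure branch that keeps receiving input.
pGetline :: I String
pGetline = nl <| liftM2 (:) one pGetline
  where nl = pSat (\c -> c == Just '\n' || c == Nothing) >> return ""

-- The optimized reader.
pGetline' :: I String
pGetline' = oneL >>= check
  where
    check (Just '\n') = return ""
    check Nothing     = return ""
    check (Just c)    = liftM (c:) pGetline'

-- Hypothetical helper: do both readers return the same line?
agree :: String -> Bool
agree s = run (en_str s pGetline) == run (en_str s pGetline')

main :: IO ()
main = print (map agree ["", "a", "ab\ncd", "\n", "hello world\n"])
```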

The parser combinators in Figure 2 are efficient: the input stream is consumed 
character-by-character and is never backtracked. These parser combinators are 
somewhat similar to camlp4 parsers [11] in structuring a parser as a team of 
concurrent simple recursive-descent 'stream parsers'. The library of Figure 2 races the 
stream parsers in parallel until one succeeds. Camlp4 orders stream parsers by 
precedence and lets a higher-precedence parser run for as long as it can. 
Camlp4 relies on look-ahead, which the iteratee parser combinators in this section 
do not have (although it could be emulated in the continuation-passing style). 

Look-ahead, or a fixed put-back, is an 'effect', to be expressed with effectful 
iteratees, see App. A. Effectful iteratees also permit better error reporting. 
App. A thus lays out the way towards implementing the interface in Figure 1 
and running our illustrative examples, §2. 

4 Conclusions 

We have introduced Iteratee IO, a compositional incremental input processing 
style with precise resource control. Like Lazy IO, it provides high abstraction, 
composability, combinator libraries, and on-demand IO. Because Iteratee IO is 
not lazy, the Iteratee library can ensure timely deallocation of resources, precise 
IO error handling, and strict control of reading actions. Incremental processing 
can now be guaranteed. Iteratee IO is therefore particularly suitable for programming 
long-running servers and processing large amounts of data. Compared to 
Handle IO, Iteratee IO is much higher level. 

We have presented a view of iteratees as processes, represented by final co- 
algebras and free monads. The view shows how to reason with iteratees and 
implement them, motivating the basic design of iteratees and explaining their 
compositions. The theory of effectful iteratees clarifies the vexing issues of buffering 
and look-ahead. The iteratee libraries have many other features such as error 
reporting, restartable exceptions, random IO, and merging of several streams. 
The iteratee-as-process view helps in understanding these advanced parts, too. 

The capabilities and applications of Iteratee IO are still being discovered. 
For example, it was recently shown (footnote 10) that monadic regions and iteratees easily 
combine; therefore it is possible after all to write an exception-safe iterFile (an 
iteratee that writes the stream data to a file), ensuring that the output file is always 
closed. 

The theory of effectful iteratees hints at the possibility of reasoning about 
computations with arbitrary IO effects (involving communication pipes, locking, 
shared memory, etc.), being very specific, at times, about the allocation of resources 
and the precise sequencing of operating system calls. We could derive 
observational equivalences of IO programs by extending equivalences of simple 
sample programs asserted by the programmer. 

Even though Lazy IO compromises equational reasoning, it was introduced 
because Haskell was perceived - by its creators - as not expressive enough for 
incremental high-level IO: "We fear that there may be no absolutely secure 
system - that is, which one guarantees the Church-Rosser property - which is 
also expressive enough to describe the programs which systems programmers (at 
least) want to write..." [7, Sec 10.5]. The pessimism turns out to be unwarranted. We 
can write high-level programs with incremental IO and precise resource control - 
in safe Haskell. 

A Effectful iteratees 

The data received by an iteratee may come from a file or a network. To get a 
chunk of that data, an enumerator has to perform IO. An iteratee may also 
need effects, e.g., to write a log, report exceptions, rewind the input stream. 
The theory of effectful iteratees developed in this section applies to all these 
cases. The theory extends the final-coalgebra representation of iteratee processes 
introduced in §3. We generalize the equational laws stated in §3 and prove them. 
The full details are in the accompanying source code IterDerivM.hs. 

As in §3, we view iteratees as processes, modeled by a final co-algebra of their 
operations - terminated or requesting a character. After receiving a character, 
the iteratee may now incur an effect, in an arbitrary monad m: 

data I m a = Done a
           | GetC (LChar -> m (I m a))

The model of the line reader process below looks identical to that of getline in 
§3. Indeed, this line reader has no effects besides requesting a character. 

getline :: Monad m => I m String
getline = loop ""
  where
    loop acc = GetC (check acc)
    check acc (Just c) | c /= '\n' = return (loop (c:acc))
    check acc _                    = return (Done (reverse acc))

Let us introduce an effect, of emitting a string: 

class Monad m => PutS m where
    putS :: String -> m ()

instance PutS IO where
    putS = putStrLn

so that we may model a line reader that writes the debugging trace of each 
received character: 

getlineT :: (PutS m, Monad m) => I m String
getlineT = loop ""
  where
    loop acc = GetC (trace acc)
    trace acc c = putS ("got " ++ show c) >> check acc c
    check acc (Just c) | c /= '\n' = return (loop (c:acc))
    check acc _                    = return (Done (reverse acc))

The interpreters of the iteratee term representation - enumerators and run - 
are defined as before, §3. The presence of the "call-by-value application" (=<<) 
reveals that the evaluation order now matters. 

en_str :: Monad m => String -> I m a -> m (I m a)
en_str "" i = return i
en_str (c:t) (GetC k) = en_str t =<< k (Just c)
en_str _ i@Done{} = return i

run :: Monad m => I m a -> m a
run (Done x) = return x
run (GetC k) = run =<< k Nothing

As in §3, monadic operations let us compose iteratee processes: I m is a 
monad - a free monad. 

instance Monad m => Monad (I m) where
    return = Done
    Done x >>= f = f x
    GetC k >>= f = GetC (k >=> (return . (>>= f)))

Somewhat surprisingly (since monads do not generally compose), the composi- 
tion of m and I m is also a monad, with the following bind operation 

type IM m a = m (I m a)

bind :: Monad m => IM m a -> (a -> IM m b) -> IM m b
bind m f = m >>= check
  where
    check (Done x) = f x
    check (GetC k) = return (GetC (\c -> bind (k c) f))

We hence combine the logging line reader getlineT to read several lines until the 
empty line: 

getlinesT :: (PutS m, Monad m) => I m [String]
getlinesT = loop []
  where
    loop acc = getlineT >>= check acc
    check acc "" = return (reverse acc)
    check acc l  = loop (l:acc)

Here is the complete example of reading lines from a given string, printing each 
character as it is being processed. 

till = print =<< run =<< en_str "abd\nxxx\nf" getlinesT
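
The pieces of this appendix can be assembled into one compilable module. The following is a sketch under assumptions, not the contents of IterDerivM.hs: the Functor and Applicative instances are added because current GHC requires them for a Monad instance, and the PutS instance for pairs (the writer monad on a list of messages) is a hypothetical addition that lets the trace be inspected as a pure value; traced is likewise a hypothetical name.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Control.Monad (ap, liftM, (>=>))

type LChar = Maybe Char              -- assumed from §3

data I m a = Done a
           | GetC (LChar -> m (I m a))

instance Monad m => Functor (I m) where fmap = liftM
instance Monad m => Applicative (I m) where
  pure  = Done
  (<*>) = ap
instance Monad m => Monad (I m) where
  Done x >>= f = f x
  GetC k >>= f = GetC (k >=> (return . (>>= f)))

class Monad m => PutS m where
  putS :: String -> m ()

instance PutS IO where
  putS = putStrLn

-- Hypothetical pure instance: collect the trace instead of printing it.
instance PutS ((,) [String]) where
  putS s = ([s], ())

getlineT :: PutS m => I m String
getlineT = loop ""
  where
    loop acc = GetC (trace acc)
    trace acc c = putS ("got " ++ show c) >> check acc c
    check acc (Just c) | c /= '\n' = return (loop (c:acc))
    check acc _                    = return (Done (reverse acc))

getlinesT :: PutS m => I m [String]
getlinesT = loop []
  where
    loop acc = getlineT >>= check acc
    check acc "" = return (reverse acc)
    check acc l  = loop (l:acc)

en_str :: Monad m => String -> I m a -> m (I m a)
en_str ""    i        = return i
en_str (c:t) (GetC k) = en_str t =<< k (Just c)
en_str _     i@Done{} = return i

run :: Monad m => I m a -> m a
run (Done x) = return x
run (GetC k) = run =<< k Nothing

-- The trace and the result as a pure pair (hypothetical example input).
traced :: ([String], [String])
traced = run =<< en_str "ab\nc" getlinesT

main :: IO ()
main = do
  ls <- run =<< en_str "ab\nc" getlinesT   -- trace goes to stdout
  print ls
  print traced
```

In the pure instance the trace shows two EOF notifications: the first terminates the unfinished line, the second is read as the empty line that ends getlinesT.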

The equational laws of iteratees and enumerators, §3, generalize to the effectful 
case. 

Equational Law 5 (Effectful Composition) 

en_str (s1 ++ s2) = en_str s1 >=> en_str s2

Here s1 must be a finite string; s2 is arbitrary. The law reads just like the original 
law of composition, this time in terms of Kleisli composition. Let us prove it. 
If we pass the Done x iteratee to the enumerators on both sides of the equality, 
the results are clearly equal: en_str s (Done x) = return (Done x) regardless of 
s. The other case is the enumerators applied to a GetC k iteratee, which we prove 
by induction on s1. In the base case, the empty s1, en_str "" is return, which is 
the unit of Kleisli composition. The inductive case: 

en_str ((c:s1') ++ s2) (GetC k)
  = property of list append
en_str (c:(s1' ++ s2)) (GetC k)
  = second clause of en_str
en_str (s1' ++ s2) =<< k (Just c)
  = inductive hypothesis
(en_str s1' >=> en_str s2) =<< k (Just c)
  = definition of (>=>)
k (Just c) >>= (\x -> en_str s1' x >>= en_str s2)
  = associativity of bind
(k (Just c) >>= en_str s1') >>= en_str s2
  = second clause of en_str
en_str (c:s1') (GetC k) >>= en_str s2
  = definition of (>=>)
(en_str (c:s1') >=> en_str s2) (GetC k)
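
Because the law speaks about effects as well as results, a check should compare both. The sketch below does so in a pure logging monad, where the effect is the list of emitted messages. Everything beyond I, en_str and run is a hypothetical illustration: the Log synonym (the writer monad on pairs), say, the tracing iteratee takeT, and the predicate law5.

```haskell
import Control.Monad ((>=>))

type LChar = Maybe Char              -- assumed from §3

data I m a = Done a
           | GetC (LChar -> m (I m a))

en_str :: Monad m => String -> I m a -> m (I m a)
en_str ""    i        = return i
en_str (c:t) (GetC k) = en_str t =<< k (Just c)
en_str _     i@Done{} = return i

run :: Monad m => I m a -> m a
run (Done x) = return x
run (GetC k) = run =<< k Nothing

-- Hypothetical logging monad: a value paired with the messages emitted.
type Log = (,) [String]

say :: String -> Log ()
say s = ([s], ())

-- Hypothetical iteratee: return up to n characters, logging each one.
takeT :: Int -> I Log String
takeT n = go n ""
  where
    go 0 acc = Done (reverse acc)
    go m acc = GetC step
      where
        step (Just c) = say ("got " ++ [c]) >> return (go (m - 1) (c:acc))
        step Nothing  = say "eof" >> return (Done (reverse acc))

-- Both sides of Equational Law 5; the comparison covers the log
-- (the effects) as well as the final result.
law5 :: Eq a => String -> String -> I Log a -> Bool
law5 s1 s2 i =
  (run =<< en_str (s1 ++ s2) i) == (run =<< (en_str s1 >=> en_str s2) i)

main :: IO ()
main = do
  print (run =<< en_str "abcde" (takeT 3))
  print [law5 "ab" "cde" (takeT 3), law5 "" "x" (takeT 2), law5 "abcd" "e" (takeT 2)]
```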

The law of chaining of §3 becomes: 

Equational Law 6 (Effectful Chaining) 1. If the iteratee i properly recognizes 
s1, then 

en_str (s1 ++ s2) (i >>= f) = en_str s1 i `bind` en_str s2 . f

2. If the iteratee i does not recognize s, then 

en_str s (i >>= f) = en_str s i >>= (return . (>>= f))

The proof of part 1 is by induction on s1, which must be finite by the definition 
of proper recognition. In the base case, the iteratee i properly recognizing the 
empty string s1 is Done x and so en_str s1 i is return (Done x), which is the left 
unit of bind. In the inductive case, the iteratee i properly recognizes the string 
c:s1'. Therefore, i must have the form GetC k where k (Just c) is an action that 
must yield an iteratee i' properly recognizing s1'. We calculate: 

en_str ((c:s1') ++ s2) (GetC k >>= f)
  = property of list append
en_str (c:(s1' ++ s2)) (GetC k >>= f)
  = definition of bind of (I m)
en_str (c:(s1' ++ s2)) (GetC (k >=> (return . (>>= f))))
  = second clause of en_str definition
en_str (s1' ++ s2) =<< (k >=> (return . (>>= f))) (Just c)
  = definition of (>=>)
en_str (s1' ++ s2) =<< (k (Just c) >>= (return . (>>= f)))
  = rearrangements
(k (Just c) >>= (return . (>>= f))) >>= en_str (s1' ++ s2)
  = associativity of bind
k (Just c) >>= (\i' -> return (i' >>= f) >>= en_str (s1' ++ s2))
  = left unit law
k (Just c) >>= (\i' -> en_str (s1' ++ s2) (i' >>= f))
  = inductive hypothesis: i' does properly recognize s1'
k (Just c) >>= (\i' -> en_str s1' i' `bind` en_str s2 . f)
  = a property of bind: m `bind` f = m >>= (\x -> return x `bind` f)
k (Just c) >>= (\i' ->
    en_str s1' i' >>= (\x -> return x `bind` en_str s2 . f))
  = associativity
(k (Just c) >>= en_str s1') >>=
    (\x -> return x `bind` en_str s2 . f)
  = second clause of en_str definition
en_str (c:s1') (GetC k) >>= (\x -> return x `bind` en_str s2 . f)
  = the same property of bind
en_str (c:s1') (GetC k) `bind` en_str s2 . f

The proof of part 2 is analogous, see IterDerivM.hs for the complete derivation. 
Since the proofs relied only on monad laws, the laws of effectful composition and 
chaining hold for any effect whatsoever. 
The divergent failure iteratee now reads 

failure :: Monad m => I m a
failure = GetC (const (return failure))

The law Zero of §3 remains the same for effectful iteratees: failure >>= f = failure. 
The proof is a trivial bisimulation. 

The left-biased alternation of effectful iteratees has the form 

(<|) :: Monad m => I m a -> I m a -> I m a
i@Done{} <| _ = i
_ <| i@Done{} = i
GetC k1 <| GetC k2 = GetC (\c -> liftM2 (<|) (k1 c) (k2 c))

To state the right-distributivity law, we need a definition: An iteratee i is idempotent 
if 

en_str s i >>= \x -> return (x, x) =
    en_str s i >>= \x -> en_str s i >>= \y -> return (x, y)

for any finite string s. The right distributivity law has the same form as given 
in §3 - with the side-condition that i must be an idempotent iteratee. 

The proof is by bi-similarity. We define the relation R on iteratees as the set 
of all pairs (iA, iB) where 

iA = i >>= \x -> (k1 x <| k2 x)
iB = (i >>= k1) <| (i >>= k2)






and we consider all observations of related iteratees. Here we only show the case 
when i is of the form GetC k for some k, in which case iA is GetC kA and iB is 
GetC kB for some kA and kB. The full proof is in IterDerivM.hs. If we feed iA 
a character c, we observe 

case GetC k >>= \x -> (k1 x <| k2 x) of GetC kA -> kA c
  = definition of (>>=)
(k >=> (return . (>>= \x -> (k1 x <| k2 x)))) c
  = definition of (>=>)
k c >>= \i' ->
    return (i' >>= \x -> (k1 x <| k2 x))

For iteratee iB, we have 

case (GetC k >>= k1) <| (GetC k >>= k2) of GetC kB -> kB c
  = definitions
case GetC (k >=> (return . (>>= k1))) <|
     GetC (k >=> (return . (>>= k2)))
  of GetC kB -> kB c
  = definition of (<|)
liftM2 (<|)
    (k c >>= \ix -> return (ix >>= k1))
    (k c >>= \iy -> return (iy >>= k2))
  = definition of liftM2
(k c >>= \ix -> return (ix >>= k1)) >>= \i1 ->
(k c >>= \iy -> return (iy >>= k2)) >>= \i2 ->
return (i1 <| i2)
  = associativity, unit laws
k c >>= \ix -> k c >>= \iy ->
    return ((ix >>= k1) <| (iy >>= k2))
  = monad unit law
(k c >>= \ix -> k c >>= \iy -> return (ix, iy)) >>=
    \(ix, iy) -> return ((ix >>= k1) <| (iy >>= k2))
  = idempotence
(k c >>= \ix -> return (ix, ix)) >>=
    \(ix, iy) -> return ((ix >>= k1) <| (iy >>= k2))
  = monad laws
k c >>= \i' ->
    return ((i' >>= k1) <| (i' >>= k2))

Thus feeding c to the related iA and iB incurs the same effect (associated with 
k c) and produces the iteratees that are also related by R. 

Treating look-ahead as an effect, the file IterDerivM.hs generalizes the 
parser combinators of Figure 2 for look-ahead. Buffering, the processing of input 
by chunks rather than by individual characters, can be handled similarly. 

B Deriving the optimal version of pGetline 

This section shows the detailed steps in the conversion of the "obviously correct" 
but grossly inefficient line reader pGetline, §3, to the efficient version pGetline'. 
The conversion relies on the equational laws of iteratees. 

Our starting point is pGetline written using the iteratee parsing combinators 
from Figure 2: 

pGetline :: I String
pGetline = nl <| liftM2 (:) one pGetline
  where nl = do
          pSat (\c -> c == Just '\n' || c == Nothing)
          return ""

First, we inline the definitions of one and pSat and desugar the do-notation: 

pGetline1 = nl <| char
  where
    nl = (oneL >>= \c -> if c == Just '\n' || c == Nothing
                           then return c else failure) >> return ""
    char = (oneL >>= maybe failure return) >>= \c -> liftM (c:) pGetline1

We re-associate the bind chains to the right: 

pGetline2 = nl <| char
  where
    nl = oneL >>= (\c -> if c == Just '\n' || c == Nothing
                           then return c >> return ""
                           else failure >> return "")
    char = oneL >>= (\c -> maybe failure return c >>= \c ->
                             liftM (c:) pGetline2)

We distribute bind into the case analysis and apply the Monad and Zero laws: 

pGetline3 = nl <| char
  where
    nl = oneL >>= (\c -> if c == Just '\n' || c == Nothing
                           then return ""
                           else failure)
    char = oneL >>= (\c -> case c of
                             Just c  -> liftM (c:) pGetline3
                             Nothing -> failure)

and then apply the right-distributivity law: 

pGetline4 = oneL >>= \c -> nl' c <| char' c
  where
    nl' c = if c == Just '\n' || c == Nothing
              then return "" else failure
    char' c = case c of
                Just c  -> liftM (c:) pGetline4
                Nothing -> failure

We pull out the case analysis on the read character, essentially "narrowing": 

pGetline5 = oneL >>= check
  where
    check (Just '\n') = return "" <| liftM ('\n':) pGetline5
    check Nothing     = return "" <| failure
    check (Just c)    = failure <| liftM (c:) pGetline5

The facts that failure is the left and the right unit of <|, and return x is its left 
zero give us pGetline'. 

C The plumbing intuition for Iteratee IO 

The diagrammatic notation for iteratee programs introduced in this section helps 
visualize the flow of input data, giving an idea of iteratee processing at a glance. 
The notation is inspired by the "Piping and Instrumentation Diagram Standard 
Notation" (footnote 11) used in Industrial Engineering for a similar purpose. For illustration, 
we show the diagrams for the examples in §2. 

11 See the example, https://controls.engin.umich.edu/wiki/index.php/PIDStandardNotation

[Diagram panels (a) to (g)]

Fig. 3. Notation for primitive components and combinators 

Figure 3 describes the notation for the Iteratee library components, Figure 1. 
Enumerator (a) is a pump, pumping data so long as it can flow. When the consumer 
is saturated and will not accept more data, the pressure rises and the 
pump shuts off. A general iteratee (b) is a reservoir with an overflow pipe. When 
the reservoir is filled up (i.e., the iteratee is Done), the further input data flow 
through the overflow pipe to the next iteratee in chain. The overflow pipe is 
shut by default. When the reservoir fills up (that is, the iteratee gets all data 
it needs) and there is no further iteratee, the data stream has nowhere to flow, 
the pressure rises and the pump (enumerator) shuts off. The iteratee getchar (c) 
is a small reservoir, which can only hold a single byte. In contrast, count_i (d) 
is an open reservoir, accepting any amount of input data. The pairing combinator 
en_pair is the Y-connector (e), splitting the stream in two. Enumeratee is 
a reactor (f), transforming the incoming (top) stream to the stream of 'reactor 
products'. When the bottom flow stops, that is, the iteratee consuming the reactor 
product finishes, the incoming flow continues through the right (overflow) 
end. The combinator (.|) (g) terminates that overflow pipe. There are cases (not 
shown in the paper) when the overflow continues: for example, when processing 
multi-part MIME messages. Kleisli composition of enumerators connects pumps 
in sequence: 




We now show the plumbing diagrams for the examples in §2. We start with 
the simplest pipeline (too simple to mention in §2) that measures the output of 
a pump: enum_file fname count_i. 




The white-space gauge: 
countWS_iter = id .| (en_filter isSpace) count_i

The gauge for "the": 
countTHE_iter = id .| enum_words .| en_filter (== "the") count_i




The counter of both "the" and the whitespace in the prefix of the input 
stream, 

run_nterml n fname =
    print =<< run =<< enum_file fname .|
        IterateeM.take n (countWS_iter `en_pair` countTHE_iter)

is drawn as follows. 



